# Lightweight inference
Baidu ERNIE 4.5 0.3B PT GGUF
Apache-2.0
A quantized version of Baidu's ERNIE-4.5-0.3B-PT model, produced with llama.cpp to reduce model size and improve inference efficiency.
Large Language Model Supports Multiple Languages
bartowski
314
3
Echelon AI Med Qwen2 7B GGUF
This project provides GGUF quantized files for the Echelon-AI/Med-Qwen2-7B model, supported by Featherless AI, aiming to improve inference performance and reduce operating costs.
Large Language Model
featherless-ai-quants
183
1
Devstral Small 2505 3bit
Apache-2.0
This is a 3-bit quantized version of the mistralai/Devstral-Small-2505 model, converted for the MLX framework, that supports multilingual text generation tasks.
Large Language Model Supports Multiple Languages
mlx-community
121
1
Qwen3 0.6B GGUF
Apache-2.0
Qwen3 is the latest generation of the Qwen (Tongyi Qianwen) series of large language models, offering both dense and Mixture-of-Experts (MoE) models. Trained at large scale, Qwen3 brings notable advances in reasoning, instruction following, agent capabilities, and multilingual support.
Large Language Model English
prithivMLmods
290
1
Kimi VL A3B Thinking 6bit
Other
Kimi-VL-A3B-Thinking-6bit is a multilingual vision-language model converted to MLX format, supporting image-text-to-text tasks.
Image-to-Text
Transformers Other

mlx-community
135
0
3b Ko Ft Research Release Q4 K M GGUF
Apache-2.0
This is a 3B-parameter language model optimized for Korean, converted to GGUF format for compatibility with llama.cpp.
Large Language Model Korean
freddyaboulton
165
0
Llama 4 Scout 17B 16E Instruct GGUF
Other
Llama-4-Scout-17B-16E-Instruct-GGUF is a quantized version of the Llama-4-Scout-17B-16E-Instruct model. It supports multilingual processing and is suitable for chat and instruction tasks.
Large Language Model
Transformers Supports Multiple Languages

gaianet
2,263
0
LBM Depth
An image depth estimation model based on Latent Bridge Matching (LBM), which achieves fast image-to-image translation by bridging in latent space.
3D Vision
jasperai
44
3
Zhaav Gemma3 4B
A Persian-language model fine-tuned from Gemma 3 using QLoRA with 4-bit quantization, suitable for running on ordinary hardware.
Large Language Model Other
alifzl
40
1
Mistral Small 3.1 24b Instruct 2503 Hf GGUF
This is a GGUF format quantized version of the mrfakename/mistral-small-3.1-24b-instruct-2503-hf model, suitable for text generation tasks.
Large Language Model
MaziyarPanahi
137.78k
2
Gemma 3 4b Pt Q4 0 GGUF
This is a GGUF format model converted from Google's Gemma 3 4B-parameter pretrained (PT) model, suitable for text generation tasks.
Large Language Model
ngxson
74
1
Gemma 3 4b It GGUF
Gemma 3 4B IT is a lightweight open-source large language model released by Google. With 4B parameters and instruction tuning, it is suitable for dialogue and instruction-following tasks.
Large Language Model
Transformers

tensorblock
395
0
Phi 4 Multimodal Instruct
MIT
Phi-4-multimodal-instruct is a lightweight open-source multimodal foundation model that builds on the language, vision, and speech research and datasets used for the Phi-3.5 and 4.0 models. It accepts text, image, and audio inputs and generates text outputs, with a context length of 128K tokens.
Text-to-Audio
Transformers Supports Multiple Languages

microsoft
584.02k
1,329
Selene 1 Mini Llama 3.1 8B Q6 K GGUF
A GGUF format model converted from AtlaAI/Selene-1-Mini-Llama-3.1-8B, suitable for text generation tasks, with support for multiple European languages.
Large Language Model Supports Multiple Languages
NikolayKozloff
127
3
USER Bge M3 Q8 0 GGUF
Apache-2.0
This model is converted from deepvk/USER-bge-m3 into GGUF format, primarily used for sentence similarity and feature extraction tasks.
Text Embedding Other
alela32
39
1
Flan T5 Base Q4 K M GGUF
Apache-2.0
This model is a GGUF format version of Google's flan-t5-base model. It supports multiple languages and is suitable for text generation and reasoning tasks.
Large Language Model Supports Multiple Languages
marroyo777
23
1
Gemma 2 Baku 2b It
An instruction fine-tuned model based on Gemma 2 Baku 2B, with improved instruction-following ability, suitable for natural language processing tasks.
Large Language Model
Transformers Japanese

rinna
2,555
27
Llama 3.2 1B Instruct Q8 0 GGUF
This is Meta's 1-billion-parameter instruction-tuned model from the Llama 3.2 series, converted to GGUF format for use with llama.cpp.
Large Language Model Supports Multiple Languages
hugging-quants
64.04k
31
Llm Jp 3 1.8b Instruct
Apache-2.0
An instruction fine-tuned large language model developed by the National Institute of Informatics in Japan, supporting Japanese and English.
Large Language Model
Transformers Supports Multiple Languages

llm-jp
2,759
25
USER Bge M3 Q4 K M GGUF
Apache-2.0
This model is converted from deepvk/USER-bge-m3 to GGUF format, primarily used for sentence similarity calculation and feature extraction.
Text Embedding Other
cm4ker
117
2
Robbert 2022 Dutch Sentence Transformers Onnx
ONNX version of the Netherlands Forensic Institute's RobBERT-2022 Dutch sentence embedding model, optimized for fast, lightweight inference.
Text Embedding
LightEmbed
18
1
Faster Whisper Large V2
MIT
This is the CTranslate2-converted version of the OpenAI Whisper large-v2 model, for efficient speech recognition.
Speech Recognition Supports Multiple Languages
guillaumekln
161.19k
199
Sbert Pq
A model based on sentence-transformers for determining the relevance between short texts and questions.
Text Embedding
Transformers Other

inkoziev
268
17
Resnet Tiny Beans
Apache-2.0
An ultra-small model trained on a bean dataset, primarily for testing and demonstration purposes.
Image Classification
Transformers

fxmarty
5,865
1
Featured Recommended AI Models